A transcriptomic analysis of photomorphogenesis in
Arabidopsis thaliana
Hugh Shanahan,
Department of Computer Science,
Royal Holloway,
University of London
UCC,
Department of Microbiology
19 September 2007
Outline

• Introduction :- A. thaliana as a model organism


• Photomorphogenesis


• A conservative estimate of differentially expressed genes


• A strategy for examining functional classification


• The picture of photomorphogenesis


• Where to go from here...
Why is Arabidopsis thaliana a good system ?

• Short lifetime - around 40 days.


• Large number of genes - ~25,500 (Drosophila around
  11,000 genes)


• Compact Genome - 125 Mbases (non-coding regions
  ~30-50%)


• Large number of array experiments (65 Affymetrix data
  sets, with tens of raw image files per data set at TAIR for
  example)


• A huge number of different strains - over half a million
  genotypes.
And why perhaps not so good....

• Not much Protein-Protein interaction data (Tandem Affinity Purification experiments
  on the way).


• The genomes of near neighbours have not been sequenced.


   • A. lyrata, C. rubella, B. rapa and T. halophila are planned or underway


• Not clear how representative genes are for other agriculturally relevant species


   • A. thaliana has a huge repertoire of Ubiquitination proteins.
Photomorphogenesis in
Arabidopsis thaliana

• Before exposure to light, seedling
  grows via skotomorphogenesis
  after germination - slow root
  growth, no growth in shoot apical
  meristem or cotyledon.


• When exposed to light, cotyledon
  grows through simple
  reproduction (control).


• Meristem (stem cells) grows by
  differentiation.


• Meristem source of true leaves.


• Little understood about process.
The data

• RNA material was gathered from the shoot apical meristem and cotyledon of
  Arabidopsis seedlings at


   • 0 hour (in darkness)


   • 1 and 6 hours (cotyledon (Cot) and meristem (Mer), with replicates)


   • 2, 24, 48 and 72 hours (Mer only)


• Samples hybridised with Affymetrix ATH1 GeneChip array.


• No amplification of RNA material !
Strategy

• Construct stringent test to determine genes which are clearly differentially
  expressed.


• Identify kinetic behaviour of different classes of differentially expressed genes (i.e. try
  to find a time line of events).


• Identify functional groupings of genes and then examine how all the genes in that
  functional grouping behave (i.e. including those that are not differentially expressed
  according to our strict criteria).
Finding differentially expressed genes

• Look at three different normalisations for Affymetrix data


   • GCRMA


   • MAS5


   • VSN


• Only consider genes that are differentially expressed in all three normalisations as
  being significant.
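
The consensus rule above can be sketched as a set intersection. This is a minimal illustration, assuming each normalisation has already produced its own list of differentially expressed genes; the function name and the gene identifiers are hypothetical and not taken from the talk.

```python
def consensus_genes(calls_by_method):
    """Return the genes called differentially expressed under every normalisation.

    calls_by_method maps a normalisation name to an iterable of gene identifiers.
    """
    gene_sets = [set(genes) for genes in calls_by_method.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Hypothetical gene calls, for illustration only.
calls = {
    "GCRMA": {"geneA", "geneB", "geneC"},
    "MAS5": {"geneB", "geneC", "geneD"},
    "VSN": {"geneC", "geneB", "geneE"},
}
consensus = consensus_genes(calls)
```

Requiring agreement across all three methods trades sensitivity for robustness: a gene whose signal depends on one particular normalisation is dropped.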
Test for significance

• Apply two-way ANOVA test. Look
  for significance with respect to


   • tissue


   • time


   • time and tissue


• Compute F-Ratio


• Only use data with two replicates
  (i.e. Cot and Mer at 0, 1 and 6
  hours)
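
As a rough sketch of the F-ratios involved, the following computes the textbook balanced two-way ANOVA for one gene, assuming a layout of 2 tissues x 3 time points x 2 replicates. The function name, data layout and numbers are assumptions made for illustration; the talk does not specify its implementation.

```python
from statistics import fmean


def two_way_anova_f(y):
    """F-ratios for a balanced two-way design with replicates.

    y[i][j] is the list of replicate values for tissue i at time point j.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell = [[fmean(y[i][j]) for j in range(b)] for i in range(a)]
    mean_a = [fmean(cell[i]) for i in range(a)]  # tissue means
    mean_b = [fmean([cell[i][j] for i in range(a)]) for j in range(b)]  # time means
    grand = fmean(mean_a)

    ss_a = b * n * sum((m - grand) ** 2 for m in mean_a)
    ss_b = a * n * sum((m - grand) ** 2 for m in mean_b)
    ss_ab = n * sum(
        (cell[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
        for i in range(a)
        for j in range(b)
    )
    ss_err = sum(
        (v - cell[i][j]) ** 2 for i in range(a) for j in range(b) for v in y[i][j]
    )
    ms_err = ss_err / (a * b * (n - 1))  # within-cell (replicate) variance
    return {
        "tissue": ss_a / (a - 1) / ms_err,
        "time": ss_b / (b - 1) / ms_err,
        "time and tissue": ss_ab / ((a - 1) * (b - 1)) / ms_err,
    }


# Made-up expression values: rows are Cot and Mer, columns are 0, 1 and 6 hours.
expression = [
    [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]],
    [[5.0, 7.0], [6.0, 8.0], [7.0, 9.0]],
]
f_ratios = two_way_anova_f(expression)
```

With only two replicates per cell, the error term rests on very few degrees of freedom, which is the motivation for the resampling step that follows.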
Finite sample size :- bootstrapping

• 2 replicates for the ANOVA data
  set.


• Cannot trust a p-value from such
  data !


• Solution :- create a large set of
  artificial data by randomly
  selecting expression values from
  all of the data.


• Compute histogram of resulting F-
  values for ANOVA test to
  determine a p-value.
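
The resampling idea can be sketched as follows. This is an assumption-laden illustration, not the talk's code: a one-way F-ratio stands in for the full two-way ANOVA statistic to keep the example short, the pool of expression values is made up, and the function names are hypothetical.

```python
import random


def f_ratio(groups):
    """One-way ANOVA F-ratio for equally sized groups (stand-in statistic)."""
    k, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_within = sum(
        (v - m) ** 2 for g, m in zip(groups, means) for v in g
    ) / (k * (n - 1))
    return ms_between / ms_within


def null_f_values(pool, k, n, draws, seed=0):
    """Build artificial data sets by sampling the pooled values with replacement."""
    if len(set(pool)) < 2:
        raise ValueError("pool needs at least two distinct values")
    rng = random.Random(seed)
    values = []
    while len(values) < draws:
        groups = [[rng.choice(pool) for _ in range(n)] for _ in range(k)]
        try:
            values.append(f_ratio(groups))
        except ZeroDivisionError:
            continue  # degenerate resample with no within-group spread
    return values


def empirical_p(observed, null_values):
    """Fraction of null F-values at least as large as the observed one."""
    exceed = sum(1 for f in null_values if f >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Made-up pool standing in for "all of the data".
pool = [float(v) for v in range(1, 51)]
null = null_f_values(pool, k=3, n=2, draws=200, seed=1)
p_large = empirical_p(float("inf"), null)
p_small = empirical_p(0.0, null)
```

The "+1" in the numerator and denominator keeps the empirical p-value away from exactly zero, so its resolution is limited by the number of resampled data sets.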
False Discovery Rate

• Bonferroni Correction is very
  conservative.


• Estimate FDR by plotting a
  histogram of the p-values.


• Fix the FDR at 5% and set the p-value threshold accordingly.


• Important step : employ Present/Absent
  filter in MAS5 to filter out
  genes. Substantial improvement in
  results.


• Final step : remove all genes with
  fold change less than 2
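
One way to turn a p-value histogram into an FDR estimate is sketched below. The talk does not say which estimator was used, so this is an assumed, Storey-style stand-in: the flat right-hand part of the histogram (p above `lam`) estimates the fraction of null genes, and that fraction gives the expected number of false positives below any threshold. All names and p-values here are hypothetical.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the fraction of null genes from the flat tail of the histogram."""
    m = len(pvalues)
    tail = sum(1 for p in pvalues if p > lam)
    return min(1.0, tail / (m * (1.0 - lam)))


def estimated_fdr(pvalues, threshold, pi0):
    """Expected false positives divided by the number of genes selected."""
    selected = sum(1 for p in pvalues if p <= threshold)
    if selected == 0:
        return 0.0
    return min(1.0, pi0 * len(pvalues) * threshold / selected)


def threshold_for_fdr(pvalues, target=0.05, lam=0.5):
    """Largest observed p-value whose estimated FDR stays at or below the target."""
    pi0 = estimate_pi0(pvalues, lam)
    passing = [p for p in sorted(pvalues) if estimated_fdr(pvalues, p, pi0) <= target]
    return max(passing) if passing else None


# Made-up p-values: 20 strong signals plus 80 evenly spread "null" values.
pvalues = [0.001] * 20 + [(i + 0.5) / 80 for i in range(80)]
pi0 = estimate_pi0(pvalues)
cutoff = threshold_for_fdr(pvalues, target=0.05)
```

The scan over thresholds is quadratic in the number of genes, which is fine for a sketch but would want sorting and a running count for a full array.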
Initial Results

• Selected 5,620 genes (out of 22,810).


   • (Very conservatively) 1/4 of the transcriptome is differentially expressed during
     photomorphogenesis.


• Majority selected through time variation (2/3 time, 1/3 tissue).


• Very small number selected using time-tissue variation (10).
Functional Classification :- it should be easy....

• Many genes in Arabidopsis have some kind of functional annotation.


• Use Gene Ontology to give a structured functional annotation.


• Enumerate numbers of genes for a given annotation.


• Compute probability of over- or under-representation using the hypergeometric
  distribution.
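
The hypergeometric calculation can be sketched as below. The symbols are assumptions introduced for the example: N genes on the array, K of them carrying a given annotation, n genes selected, and k selected genes carrying the annotation. The small numbers are illustrative, not results from the talk.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """Probability of exactly k annotated genes among n drawn without replacement."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def representation_p_values(k, N, K, n):
    """Return (P[X >= k], P[X <= k]) for over- and under-representation."""
    lo, hi = max(0, n - (N - K)), min(K, n)
    pmf = {i: hypergeom_pmf(i, N, K, n) for i in range(lo, hi + 1)}
    over = sum(p for i, p in pmf.items() if i >= k)
    under = sum(p for i, p in pmf.items() if i <= k)
    return over, under


# Toy example: 10 genes, 4 annotated, 3 selected, 2 of the selected are annotated.
p_over, p_under = representation_p_values(k=2, N=10, K=4, n=3)
```

A small `p_over` suggests the annotation is over-represented in the selected set, a small `p_under` that it is under-represented. Python integers are exact, so the binomial coefficients do not overflow for array-sized N, although summing the full tail becomes slow.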
Functional Classification :- but it isn’t....

• Gene Ontology annotations are useful but far too specific at the lowest nodes - a
  problem for the False Discovery Rate calculation.


• Initially interested in general picture, what is the highest level annotation ?


• GO slim should cover more general cases, however annotations of genes can have
  multiple parents, e.g. a gene that has kinase function and binds to DNA will sit in both
  classes.


• Ultimately, we developed our own general functional annotation.


• Lesson :- GO has a huge amount of information, but when looking at the big picture,
  you need to make the decisions !
Example
(Figure: plot of log(p), with regions labelled Over-representation and Under-representation.)
Clustering :- Understanding the time behaviour

• Important lesson :- cluster early, cluster often.


• While genes have been selected using data at 0, 1 and 6 hours, all data are used for
  clustering.


• Results quoted use K-Means (20 clusters), but cluster consistency was checked by varying
  the number of clusters and by Hierarchical Clustering (a very different clustering
  algorithm).


• For each cluster, examine functional classes and explore any possible
  over-representations.
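
A minimal K-Means sketch for time-course profiles follows. It is a plain Lloyd's-algorithm illustration under assumptions: each gene is a list of expression values over the time points, distances are squared Euclidean, and the profiles, `k` and seed are made up. It is not the software used for the results quoted.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(profiles, k, iterations=100, seed=0):
    """Assign each profile to one of k clusters; returns (labels, centres)."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centres[c])) for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Made-up profiles: three rising and three falling time courses.
profiles = [
    [0.0, 1.0, 2.0], [0.0, 1.1, 2.1], [0.1, 1.0, 2.0],
    [2.0, 1.0, 0.0], [2.1, 1.0, 0.0], [2.0, 1.1, 0.1],
]
labels, centres = kmeans(profiles, k=2)
```

Re-running with different values of `k` and different seeds, then comparing the groupings, is one simple way to carry out the consistency check described above.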
K-means clusters (reordered)
Yes, but what does it mean ?
First Phase

• Transcription Factors


• Ubiquitination


• Kinases


• More down-regulation than up.
Regulator (?) classes, including non-selected genes
Second Phase

• Ribosomal activity


• cell cycle


• hormone-related activity
Ribosome working overtime
Hormones

• Auxin, Ethylene promote growth
  (elongation).


• Cytokinin represses elongation.


• Cannot track hormone concentrations.


• Follow activity of genes regulated
  by hormones (strictly differentially
  expressed).


• Auxin, Ethylene “repressed”.


• Cytokinin “promoted”.
Third Phase

• Photosynthesis


• Cell wall loosening
Cell Wall

• Plant cells have rigid cell walls.


• Expansion implies that cell walls
  must become less rigid.


• Complicated process between cell
  wall modification and internal
  turgor pressure.


• Nonetheless, see late expression
  in genes controlling this behaviour.
Summary of what’s been seen

• After constructing a set of genes which are strictly differentially expressed we find:


• An early burst (0-1 hours after exposure to light) of genes in the meristem that are
  regulatory in nature and are in general down- rather than up-regulated.


• Around 6 hours after exposure to light, there is evidence for cell division and
  repression of growth.


• At later times, parts of meristem are already starting to behave like leaves and we
  see growth through expansion rather than division (up-regulation of relevant
  hormone-related genes, down-regulation of ribosomal genes).
Where to go from here

• So far we’ve put together a picture of which genes are up- and down-regulated; their
  classification fits in with a picture of particular types of growth.


• Useful at middle and late times (6 hours and beyond)


• We see sets of transcription factors, MAP kinases and Ubiquitination-related genes
  which are over-represented.


• Can we identify targets of these regulators ?


• Start off with co-regulated sets and see if we have common upstream promoter
  elements.


• How conserved are the above mechanisms elsewhere ?
Acknowledgements

School of Biological Sciences, Royal Holloway

 • Enrique López-Juez
 • Edyta Dillon
 • Safina Khan
 • Zoltan Magyar
 • Laszlo Bögre

Computer Science, Royal Holloway

 • Saul Hazeldine

Department of Molecular Genetics, Ghent University

 • Gerrit Beemster

Institute of Biotechnology, University of Cambridge

 • James A. H. Murray
